AD-A077 636  NORTH CAROLINA UNIV AT CHAPEL HILL INST OF STATISTICS  F/G 12/1

NONPARAMETRIC REGRESSION BASED ON THE CONCOMITANTS OF ORDER STA--ETC(U)
SEP 79  G JOHNSTON  AFOSR-75-2796

UNCLASSIFIED  MIMEO  SER-1299  AFOSR-TR-79-1099  NL 


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE (READ INSTRUCTIONS BEFORE COMPLETING FORM)

1. REPORT NUMBER: AFOSR-TR-79-1099
2. GOVT ACCESSION NO.:
3. RECIPIENT'S CATALOG NUMBER:
4. TITLE (and Subtitle): Nonparametric Regression Based on the Concomitants of Order Statistics
5. TYPE OF REPORT & PERIOD COVERED: Interim
6. PERFORMING ORG. REPORT NUMBER: Mimeo Series No. 1249
7. AUTHOR(s): Gordon Johnston
8. CONTRACT OR GRANT NUMBER(s): AFOSR-75-2796
9. PERFORMING ORGANIZATION NAME AND ADDRESS: Department of Statistics, University of North Carolina, Chapel Hill, North Carolina 27514
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS: 61102F 2304/A5
11. CONTROLLING OFFICE NAME AND ADDRESS: Air Force Office of Scientific Research, Bolling AFB, DC 20332
12. REPORT DATE: September 1979
13. NUMBER OF PAGES: 14
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
15. SECURITY CLASS. (of this report): UNCLASSIFIED
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16. DISTRIBUTION STATEMENT (of this Report): Approved for Public Release: Distribution Unlimited
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18. SUPPLEMENTARY NOTES:
19. KEY WORDS (Continue on reverse side if necessary and identify by block number): Regression, nonparametric estimation, density estimation, concomitant order statistics, Gaussian processes
20. ABSTRACT (Continue on reverse side if necessary and identify by block number): We investigate the properties of nonparametric regression function estimates based on the concomitants of order statistics.

DD FORM 1473 (1 JAN 73)  EDITION OF 1 NOV 65 IS OBSOLETE

UNCLASSIFIED

NONPARAMETRIC  REGRESSION  BASED  ON  THE  CONCOMITANTS  OF  ORDER  STATISTICS 


by 

Gordon  Johnston* 


Abstract 


We  investigate  the  properties  of  nonparametric  regression  function 
estimates  based  on  the  concomitants  of  order  statistics. 


Key  Words  and  Phrases: 


Regression,  nonparametric  estimation,  density 
estimation,  concomitant  order  statistics,  Gaussian 
processes. 


[DDC stamp: DEC 3 1979]


AIR FORCE OFFICE OF SCIENTIFIC RESEARCH (AFSC)
NOTICE OF TRANSMITTAL TO DDC
This technical report has been reviewed and is approved for public release IAW AFR 190-12 (7b). Distribution is unlimited.

A. D. BLOSE
Technical Information Officer


This  work  was  supported  by  the  Air  Force  Office  of  Scientific  Research  under 
Contract  AFOSR-75-2796. 




1.  Introduction. 

S. S. Yang (1977) proposed as an estimate of the regression function $m(u) = E[Y \mid X = u]$ of a bivariate random vector $(X,Y)$ the statistic $M_n$ defined by

$$M_n(u) = (n\varepsilon_n)^{-1} \sum_{i=1}^{n} K\!\left(\frac{i/n - F_n(u)}{\varepsilon_n}\right) Y_{[i:n]}.$$

Here $\{\varepsilon_n^{-1}K(x/\varepsilon_n)\}$ is a $\delta$-function sequence of kernel type (Watson and Leadbetter (1964)), $(X_i, Y_i)$, $i = 1,\ldots,n$, are i.i.d. observations on $(X,Y)$, $F_n$ is the empirical distribution function (EDF) of the X-observations, and $Y_{[i:n]}$ is the Y-observation corresponding to the i-th order statistic of the X-observations, i.e., the i-th concomitant of the X-values (see, e.g., Yang (1977)).
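The definition of $M_n$ can be made concrete with a short computation. The following is a minimal sketch, not the author's code: the Epanechnikov kernel, the function names `epanechnikov` and `yang_estimate`, and the simulated data are all illustrative assumptions. It sorts the pairs by X so that the Y-values become the concomitants $Y_{[i:n]}$, evaluates the EDF at $u$, and forms the kernel-weighted sum.

```python
import bisect
import random


def epanechnikov(t):
    """Kernel K with bounded support [-1, 1] (an illustrative choice, not from the report)."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def yang_estimate(xs, ys, u, eps, kernel=epanechnikov):
    """M_n(u) = (n*eps)^-1 * sum_i K((i/n - F_n(u)) / eps) * Y_[i:n]."""
    n = len(xs)
    pairs = sorted(zip(xs, ys))                    # order by X; the Y's become concomitants
    sorted_x = [x for x, _ in pairs]
    f_n_u = bisect.bisect_right(sorted_x, u) / n   # empirical d.f. of the X's at u
    total = 0.0
    for i, (_, y_conc) in enumerate(pairs, start=1):
        total += kernel((i / n - f_n_u) / eps) * y_conc
    return total / (n * eps)


# Hypothetical data: constant response, so the regression function is m(u) = 2.
rng = random.Random(0)
xs = [rng.random() for _ in range(1000)]
ys = [2.0 for _ in xs]
estimate = yang_estimate(xs, ys, u=0.5, eps=0.1)
```

With a constant response the kernel weights form a Riemann sum for $\int K(t)\,dt = 1$, so the estimate should be close to the constant whenever the window $F_n(u) \pm \varepsilon_n$ lies inside $[0,1]$.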

Our purpose here is to find conditions under which

$$(2\delta\log n)^{1/2}\left[\sup_{a\le u\le b}\frac{(n\varepsilon_n)^{1/2}\,|M_n(u)-m(u)|}{\left[s(u)\int K^2(t)\,dt\right]^{1/2}} - d_n\right] \xrightarrow{d} E \tag{1.1}$$

as $n \to \infty$, where $E$ is a random variable with distribution function $e^{-2e^{-x}}$, $a$, $b$ are constants, $\{\varepsilon_n\}$ and $\{d_n\}$ are appropriate real sequences (with $\varepsilon_n = n^{-\delta}$, as in Section 3), and $s(u) = E[Y^2 \mid X = u]$. Bickel and Rosenblatt (1973) proved a similar result for kernel estimates of a density function. A large sample confidence interval for $m(u)$, based on $M_n(u)$, is given, using (1.1).

We also give conditions under which

$$(n\varepsilon_n)^{1/2}\,[M_n(u) - m(u)] \xrightarrow{d} N\!\left(0,\ s(u)\int K^2(t)\,dt\right) \quad \text{as } n \to \infty \tag{1.2}$$

for appropriate points $u$ and sequence $\{\varepsilon_n\}$.

Our method of proof is to show that

$$(n\varepsilon_n\log n)^{1/2}\sup_{a\le u\le b}\,|M_n(u) - M_n^{**}(u)| \xrightarrow{P} 0, \tag{1.3}$$

where $M_n^{**}$ is defined by

$$M_n^{**}(u) = (n\varepsilon_n)^{-1}\sum_{i=1}^{n} Y_i\,K\!\left(\frac{F(X_i) - F(u)}{\varepsilon_n}\right). \tag{1.4}$$

$M_n^{**}$ is a special case of the regression function estimate proposed by Watson (1964). Johnston (1979) gives conditions under which (1.1) and (1.2) hold for $M_n^{**}$ in place of $M_n$, and (1.1) and (1.2) will thus hold by virtue of (1.3).
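A small simulation can illustrate how close $M_n$ and $M_n^{**}$ are in practice. The sketch below is only an illustration under stated assumptions: X is taken uniform on (0,1) so that $F(x) = x$ is known, the response model, kernel, sample size, and function names (`m_n`, `m_n_star_star`) are hypothetical choices, and (1.3) is an asymptotic statement, so at moderate $n$ the scaled quantity need not be small.

```python
import bisect
import math
import random


def kernel(t):
    """Epanechnikov kernel on [-1, 1]; an illustrative choice of K."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def m_n(xs, ys, u, eps):
    """Yang's estimate: the EDF F_n appears in both kernel arguments."""
    n = len(xs)
    sorted_x = sorted(xs)
    f_n = lambda v: bisect.bisect_right(sorted_x, v) / n
    f_n_u = f_n(u)
    return sum(y * kernel((f_n(x) - f_n_u) / eps) for x, y in zip(xs, ys)) / (n * eps)


def m_n_star_star(xs, ys, u, eps, cdf):
    """M_n**: same form, with the true d.f. F (passed as cdf) in place of F_n."""
    n = len(xs)
    f_u = cdf(u)
    return sum(y * kernel((cdf(x) - f_u) / eps) for x, y in zip(xs, ys)) / (n * eps)


# Hypothetical model: X ~ Uniform(0, 1), Y = sin(2*pi*X) + small Gaussian noise.
rng = random.Random(1)
n, eps = 2000, 0.1
xs = [rng.random() for _ in range(n)]
ys = [math.sin(2 * math.pi * x) + 0.1 * rng.gauss(0.0, 1.0) for x in xs]
uniform_cdf = lambda v: min(max(v, 0.0), 1.0)
grid = [0.2 + 0.05 * k for k in range(13)]       # points in [a, b] = [0.2, 0.8]
sup_diff = max(abs(m_n(xs, ys, u, eps) - m_n_star_star(xs, ys, u, eps, uniform_cdf))
               for u in grid)
scaled = math.sqrt(n * eps * math.log(n)) * sup_diff   # the quantity appearing in (1.3)
```

Since $F_n(X_i)$ equals the rank of $X_i$ divided by $n$, `m_n` coincides with the order-statistic form of the estimate; the only difference between the two functions is whether the true or the empirical d.f. is used.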

2. Asymptotic Equivalence of $M_n$ and $M_n^{**}$.

In this section we verify (1.3). The proof is given in the Appendix since it is rather technical and lengthy. Define

$$M_n^{*}(u) = (n\varepsilon_n)^{-1}\sum_{i=1}^{n} Y_i\,K\!\left(\frac{F_n(X_i) - F(u)}{\varepsilon_n}\right).$$

Then Lemma 2.1 gives conditions under which

$$(n\varepsilon_n\log n)^{1/2}\sup_{a\le u\le b}\,|M_n^{*}(u) - M_n(u)| \xrightarrow{P} 0 \tag{2.1}$$

and

$$(n\varepsilon_n\log n)^{1/2}\sup_{a\le u\le b}\,|M_n^{**}(u) - M_n^{*}(u)| \xrightarrow{P} 0, \tag{2.2}$$

which together imply (1.3).

Lemma 2.1. Suppose $\{\varepsilon_n^{-1}K(x/\varepsilon_n)\}$ is a $\delta$-function sequence such that $(\log n)^{-1}(n\varepsilon_n^{7/2}) \to \infty$, and that $K$ has bounded support and 3 bounded continuous derivatives on the support. Suppose $\int |K''(t)|\,dt < \infty$ and $K$ and $K'$ are of bounded variation.

Let $(X,Y)$ be such that $E|Y| < \infty$, $g(u) = E[Y \mid X = F^{-1}(u)]$ has 2 bounded derivatives on $[0,1]$ and $h(u) = E[\,|Y| \mid X = F^{-1}(u)]$ is bounded on $[0,1]$.

Assume there exists a real sequence $\{a_n\}$ such that $a_n \to \infty$,

$$a_n^2\log n/(n\varepsilon_n^3) \to 0 \quad\text{and}\quad n^{1/2}\int_{|y|>a_n} |y|\,dF^Y(y) \to 0 \quad\text{as } n\to\infty.$$

Then, for $0 < F(a) < F(b) < 1$, (2.1) and (2.2) hold. □

3. Applications.

We will assume throughout this section that the assumptions of Lemma 2.1 are in force. We first note that $M_n^{**}$ may be written as

$$M_n^{**}(u) = (n\varepsilon_n)^{-1}\sum_{i=1}^{n} Y_i\,K\!\left(\frac{Z_i - F(u)}{\varepsilon_n}\right),$$

where

$$Z_i = F(X_i) \sim U(0,1).$$

According to Theorem 2.5.2 of Johnston (1979), under certain conditions,

$$(n\varepsilon_n)^{1/2}\,[M_n^{**}(u) - E(Y \mid Z = F(u))] \xrightarrow{d} N\!\left(0,\ E(Y^2 \mid Z = F(u))\int K^2(t)\,dt\right).$$

If we assume $F$ to be strictly increasing, then

$$E(Y \mid Z = F(u)) = m(u), \qquad E(Y^2 \mid Z = F(u)) = s(u).$$

Thus we have, by virtue of (1.3),

$$(n\varepsilon_n)^{1/2}\,[M_n(u) - m(u)] \xrightarrow{d} N\!\left(0,\ s(u)\int K^2(t)\,dt\right),$$

which completes the proof of normality of $M_n$. We note that this asymptotic variance differs from that of Yang (1977), Theorem 6.

If the conditions of Corollary 3.2.9 of Johnston (1979) hold, then

$$(2\delta\log n)^{1/2}\left[\sup_{a\le u\le b}\frac{(n\varepsilon_n)^{1/2}\,|M_n^{**}(u)-m(u)|}{\left[s(u)\int K^2(t)\,dt\right]^{1/2}} - d_n\right] \xrightarrow{d} E, \tag{3.1}$$

where $E$ is a random variable with distribution function $e^{-2e^{-x}}$. Here $\varepsilon_n = n^{-\delta}$, $1/5 < \delta < 1/3$, and $d_n$ is the sequence of centering constants specified in Bickel and Rosenblatt (1973). By virtue of (1.3), (3.1) holds with $M_n$ replacing $M_n^{**}$, as we wished to prove. Inverting (3.1) in the usual way yields an approximate $(1-\alpha)\times 100\%$ confidence band for $m(u)$ over the interval $(a,b)$, based on $M_n(u)$:

$$M_n(u) \pm (n\varepsilon_n)^{-1/2}\left[s(u)\int K^2(t)\,dt\right]^{1/2}\left[d_n + \frac{c(\alpha)}{(2\delta\log n)^{1/2}}\right],$$

where

$$c(\alpha) = \log 2 - \log|\log(1-\alpha)|.$$
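The constant $c(\alpha)$ is the $(1-\alpha)$ quantile of the limit law: setting $e^{-2e^{-c}} = 1-\alpha$ gives $e^{-c} = |\log(1-\alpha)|/2$, hence $c = \log 2 - \log|\log(1-\alpha)|$. The sketch below evaluates $c(\alpha)$ and the half-width of the band at a single point. It is an illustration only: the function names are hypothetical, $s(u)$ would in practice have to be replaced by an estimate (an assumption, not part of the report), and $d_n$ is taken as an input because its formula is given in Bickel and Rosenblatt (1973) and is not reproduced here.

```python
import math


def c_alpha(alpha):
    """c(alpha) = log 2 - log|log(1 - alpha)|, which solves exp(-2*exp(-c)) = 1 - alpha."""
    return math.log(2.0) - math.log(abs(math.log(1.0 - alpha)))


def band_half_width(n, delta, s_u, k2_integral, d_n, alpha):
    """Half-width of the approximate confidence band at a point u.

    s_u         : value (or a plug-in estimate) of s(u) = E[Y^2 | X = u]
    k2_integral : integral of K^2 (for example 0.6 for the Epanechnikov kernel)
    d_n         : centering constant, supplied by the caller
    """
    eps_n = n ** (-delta)                                # bandwidth eps_n = n^(-delta)
    scale = math.sqrt(s_u * k2_integral / (n * eps_n))
    return scale * (d_n + c_alpha(alpha) / math.sqrt(2.0 * delta * math.log(n)))
```

For example, `band_half_width(1000, 0.25, 1.0, 0.6, d_n, 0.05)` gives the half-width for a hypothetical sample of size 1000 once a value of `d_n` has been computed separately; a smaller $\alpha$ gives a larger $c(\alpha)$ and hence a wider band.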
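APPENDIX

Proof of Lemma 2.1.

We begin with the following preliminary lemma, which is very similar to Lemma 1 of Bhattacharyya (1967).

Lemma A1. Assume that $g(u) = E(Y \mid X = F^{-1}(u))$ has $r$ continuous derivatives on $[0,1]$, $r > 0$, and that $K$ has bounded support and $r$ bounded derivatives on the support. Then for $a$, $b$ such that $0 < F(a) < F(b) < 1$,

$$\varepsilon_n^{-(r+1)}\iint y\,K^{(r)}\!\left(\frac{F(x)-F(z)}{\varepsilon_n}\right)dF(x,y) = O(1)$$

uniformly in $z \in [a,b]$ as $n \to \infty$.

Proof. Note that

$$\varepsilon_n^{-(r+1)}\iint y\,K^{(r)}\!\left(\frac{F(x)-F(z)}{\varepsilon_n}\right)dF(x,y) = \varepsilon_n^{-(r+1)}\,E\!\left[Y K^{(r)}\!\left(\frac{F(X)-F(z)}{\varepsilon_n}\right)\right]$$

$$= \varepsilon_n^{-(r+1)}\int m(x)\,K^{(r)}\!\left(\frac{F(x)-F(z)}{\varepsilon_n}\right)dF(x) = \varepsilon_n^{-(r+1)}\int_0^1 g(u)\,K^{(r)}\!\left(\frac{u-F(z)}{\varepsilon_n}\right)du.$$

Now, integrating by parts $r$ times, write

$$\varepsilon_n^{-(r+1)}\int_0^1 g(u)\,K^{(r)}\!\left(\frac{u-F(z)}{\varepsilon_n}\right)du = (-1)^r\,\varepsilon_n^{-1}\int_0^1 g^{(r)}(u)\,K\!\left(\frac{u-F(z)}{\varepsilon_n}\right)du + \sum_{s=0}^{r-1}(-1)^{r-s-1}\,\varepsilon_n^{-(s+1)}\left[g^{(r-s-1)}(u)\,K^{(s)}\!\left(\frac{u-F(z)}{\varepsilon_n}\right)\right]_{u=0}^{1}.$$

Hence

$$\sup_z\left|\varepsilon_n^{-(r+1)}\int_0^1 g(u)\,K^{(r)}\!\left(\frac{u-F(z)}{\varepsilon_n}\right)du\right| \le \sup_z\left|\varepsilon_n^{-1}\int_0^1 g^{(r)}(u)\,K\!\left(\frac{u-F(z)}{\varepsilon_n}\right)du\right| + \sup_z\left|\sum_{s=0}^{r-1}\varepsilon_n^{-(s+1)}\left[g^{(r-s-1)}(u)\,K^{(s)}\!\left(\frac{u-F(z)}{\varepsilon_n}\right)\right]_{u=0}^{1}\right|.$$

The second term above is zero for large $n$ since the argument of $K^{(s)}$ is eventually outside the support of $K$. Write

$$\sup_z\left|\varepsilon_n^{-1}\int_0^1 g^{(r)}(u)\,K\!\left(\frac{u-F(z)}{\varepsilon_n}\right)du\right| = \sup_z\left|\int_{-F(z)/\varepsilon_n}^{(1-F(z))/\varepsilon_n} K(v)\,g^{(r)}(\varepsilon_n v + F(z))\,dv\right| \le \sup_{0\le t\le 1}|g^{(r)}(t)|\int |K(v)|\,dv < \infty.$$

□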
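The boundedness asserted in Lemma A1 can be checked numerically in a simple case. The sketch below is an illustration under assumptions that are not in the report: $r = 1$, the biweight kernel $K(t) = (15/16)(1-t^2)^2$ on $[-1,1]$, $g(u) = u^2$, the interior point $F(z) = 0.5$, and a midpoint rule for the integral. In this case one integration by parts gives $\varepsilon^{-2}\int_0^1 g(u)K'((u-c)/\varepsilon)\,du = -\int g'(c+\varepsilon v)K(v)\,dv = -g'(c)$, so the scaled integral should stay near $-1$ as $\varepsilon$ shrinks.

```python
def biweight_deriv(t):
    """K'(t) for the biweight kernel K(t) = (15/16)(1 - t^2)^2 on [-1, 1]."""
    return -3.75 * t * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def scaled_integral(g, c, eps, steps=20000):
    """Midpoint-rule value of eps^-2 * integral_0^1 g(u) K'((u - c)/eps) du (case r = 1)."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        total += g(u) * biweight_deriv((u - c) / eps)
    return total * h / (eps * eps)


# Hypothetical test function g(u) = u^2 and point c = F(z) = 0.5.
values = [scaled_integral(lambda u: u * u, 0.5, eps) for eps in (0.2, 0.1, 0.05, 0.01)]
```

The list `values` holds the scaled integral for a decreasing sequence of bandwidths; its staying bounded (here, essentially constant) is the content of the lemma for this particular $g$ and $K$.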


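We now proceed with the proof of Lemma 2.1. It is convenient to rewrite

$$M_n(u) = \varepsilon_n^{-1}\iint y\,K\!\left(\frac{F_n(x)-F_n(u)}{\varepsilon_n}\right)dF_n(x,y),$$

and similarly for $M_n^{*}$ and $M_n^{**}$. Thus, letting $Z_n(x,y) = F_n(x,y) - F(x,y)$, we may write

$$M_n^{*}(u) - M_n(u) = \varepsilon_n^{-1}\iint y\left[K\!\left(\frac{F_n(x)-F(u)}{\varepsilon_n}\right) - K\!\left(\frac{F_n(x)-F_n(u)}{\varepsilon_n}\right)\right]dZ_n(x,y)$$

$$\qquad + \varepsilon_n^{-1}\iint y\left[K\!\left(\frac{F_n(x)-F(u)}{\varepsilon_n}\right) - K\!\left(\frac{F_n(x)-F_n(u)}{\varepsilon_n}\right)\right]dF(x,y) = J_1 + J_2, \text{ say.}$$

We first show $(n\varepsilon_n\log n)^{1/2}\sup_u|J_2| \xrightarrow{P} 0$. Since, by assumption, $K$ has 3 continuous derivatives, we may write (by expanding $K((F_n(x)-F_n(u))/\varepsilon_n)$ about $(F_n(x)-F(u))/\varepsilon_n$)

$$J_2 = \varepsilon_n^{-2}\,[F_n(u)-F(u)]\iint y\,K'\!\left(\frac{F_n(x)-F(u)}{\varepsilon_n}\right)dF(x,y)$$

$$\qquad - \tfrac{1}{2}\,\varepsilon_n^{-3}\,[F_n(u)-F(u)]^2\iint y\,K''\!\left(\frac{F_n(x)-F(u)}{\varepsilon_n}\right)dF(x,y)$$

$$\qquad + \tfrac{1}{6}\,\varepsilon_n^{-4}\,[F_n(u)-F(u)]^3\iint y\,K'''\!\left(\frac{F_n(x)-w_n(u)}{\varepsilon_n}\right)dF(x,y)$$

$$= J_2^{(1)} + J_2^{(2)} + J_2^{(3)}, \text{ say,}$$

where $w_n(u)$ is between $F_n(u)$ and $F(u)$.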

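Now, expanding $K'((F_n(x)-F(u))/\varepsilon_n)$ about $(F(x)-F(u))/\varepsilon_n$ yields

$$(n\varepsilon_n\log n)^{1/2}\sup_u|J_2^{(1)}| \le (n\varepsilon_n\log n)^{1/2}\sup_u|F_n(u)-F(u)|\,\Bigg\{\sup_u\left|\varepsilon_n^{-2}\iint y\,K'\!\left(\frac{F(x)-F(u)}{\varepsilon_n}\right)dF(x,y)\right|$$

$$\qquad + \sup_u\left|\varepsilon_n^{-3}\iint y\,[F_n(x)-F(x)]\,K''\!\left(\frac{F(x)-F(u)}{\varepsilon_n}\right)dF(x,y)\right|$$

$$\qquad + \sup_u\left|\tfrac{1}{2}\,\varepsilon_n^{-4}\iint y\,[F_n(x)-F(x)]^2\,K'''\!\left(\frac{v_n(x,u)}{\varepsilon_n}\right)dF(x,y)\right|\Bigg\}, \tag{A1}$$

where $v_n(x,u)$ is between $F_n(x)-F(u)$ and $F(x)-F(u)$.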

The fact that $\sup_u|F_n(u)-F(u)| = O_p(n^{-1/2})$, together with Lemma A1, implies that the first term on the RHS of inequality (A1) goes to zero. For the second term, note that

$$\varepsilon_n^{-1}\iint |y|\,\left|K''\!\left(\frac{F(x)-F(u)}{\varepsilon_n}\right)\right|dF(x,y) = \varepsilon_n^{-1}\int_0^1 h(t)\,\left|K''\!\left(\frac{t-F(u)}{\varepsilon_n}\right)\right|dt = \int_{-F(u)/\varepsilon_n}^{(1-F(u))/\varepsilon_n}|K''(v)|\,h(\varepsilon_n v + F(u))\,dv,$$

which is a bounded sequence since $h$ is bounded and $K''$ has bounded support. Thus the second term on the RHS of (A1) is equal to $(n\varepsilon_n\log n)^{1/2}\,\varepsilon_n^{-2}\,O_p(n^{-1})\,O(1)$, which converges to zero in probability if $(n\varepsilon_n\log n)^{1/2}/(n\varepsilon_n^2) \to 0$, i.e., if $(n\varepsilon_n^3)(\log n)^{-1} \to \infty$, which is true by assumption. For the third term on the RHS of (A1) note

$$\left|\iint y\,K'''\!\left(\frac{v_n(x,u)}{\varepsilon_n}\right)dF(x,y)\right| \le \sup_v|K'''(v)|\;E|Y| < \infty.$$

Thus the third term is a $(n\varepsilon_n\log n)^{1/2}\,\varepsilon_n^{-4}\,O_p(n^{-3/2})$ sequence, and converges to zero in probability since $(\log n)^{-1}\,n\varepsilon_n^{7/2} \to \infty$. Similar arguments apply to $J_2^{(2)}$ and $J_2^{(3)}$, and we have shown $(n\varepsilon_n\log n)^{1/2}\sup_u|J_2| \xrightarrow{P} 0$.

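We now turn to $J_1$. Let $\{a_n\}$ be a sequence as specified in the hypotheses and write

$$J_1 = \varepsilon_n^{-1}\iint_{|y|>a_n} y\,G_n(x,u)\,Z_n(dx,dy) + \varepsilon_n^{-1}\iint_{|y|\le a_n} y\,G_n(x,u)\,Z_n(dx,dy) = J_1^{(1)} + J_1^{(2)}, \text{ say,}$$

where, for convenience, we write

$$G_n(x,u) = K\!\left(\frac{F_n(x)-F(u)}{\varepsilon_n}\right) - K\!\left(\frac{F_n(x)-F_n(u)}{\varepsilon_n}\right).$$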

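Using integration by parts, write

$$J_1^{(2)} = -\lim_{t\to-\infty}\varepsilon_n^{-1}\,G_n(t,u)\int_{-a_n}^{a_n} y\,Z_n(t,dy) + \lim_{t\to\infty}\varepsilon_n^{-1}\,G_n(t,u)\int_{-a_n}^{a_n} y\,Z_n(t,dy)$$

$$\qquad + \varepsilon_n^{-1}\iint_{|y|\le a_n} Z_n(x,y)\,dy\,G_n(dx,u) - \varepsilon_n^{-1}a_n\int Z_n(x,a_n)\,G_n(dx,u) - \varepsilon_n^{-1}a_n\int Z_n(x,-a_n)\,G_n(dx,u)$$

$$= I_1 + I_2 + I_3 + I_4 + I_5, \text{ say.}$$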


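Since $Z_n(-\infty,y) = 0$ for each $n$ and $y$, it is easily ascertained that $I_1 = 0$ for each $n$ (e.g. Natanson (1961), p. 233). Similarly,

$$I_2 = I_2(u) = \lim_{t\to\infty}\varepsilon_n^{-1}\,G_n(t,u)\int_{-a_n}^{a_n} y\,dQ_n(y),$$

where

$$Q_n(y) = \lim_{t\to\infty} Z_n(t,y) = F_n^Y(y) - F^Y(y).$$

Now

$$\int_{-a_n}^{a_n} y\,dQ_n(y) = n^{-1}\sum_{i=1}^{n}\left\{Y_i\,1_{[-a_n,a_n]}(Y_i) - E\!\left[Y_i\,1_{[-a_n,a_n]}(Y_i)\right]\right\} = a_n\,O_p(n^{-1/2})$$

as $n\to\infty$ by standard central limit theorem arguments. Further, using the mean value theorem,

$$\lim_{t\to\infty} G_n(t,u) = K\!\left(\frac{1-F(u)}{\varepsilon_n}\right) - K\!\left(\frac{1-F_n(u)}{\varepsilon_n}\right) = \frac{F_n(u)-F(u)}{\varepsilon_n}\,K'\!\left(\frac{1-q_n(u)}{\varepsilon_n}\right) = \varepsilon_n^{-1}\,O_p(n^{-1/2})$$

uniformly in $u$, where $q_n(u)$ is between $F_n(u)$ and $F(u)$.

Thus

$$(n\varepsilon_n\log n)^{1/2}\sup_u|I_2(u)| = a_n\,(n\varepsilon_n\log n)^{1/2}\,\varepsilon_n^{-2}\,O_p(n^{-1}) \xrightarrow{P} 0$$

since $a_n^2\log n/(n\varepsilon_n^3) \to 0$ by assumption.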

As the final step in the proof, we must verify that $(n\varepsilon_n\log n)^{1/2}\sup_u|J_1^{(1)}| \xrightarrow{P} 0$. Note that

$$\varepsilon_n\,|J_1^{(1)}| \le \left|\iint_{|y|>a_n} y\,G_n(x,u)\,dF_n(x,y)\right| + \left|\iint_{|y|>a_n} y\,G_n(x,u)\,dF(x,y)\right|. \tag{A2}$$

For the first term, note

$$\left|\iint_{|y|>a_n} y\,G_n(x,u)\,dF_n(x,y)\right| \le \sup_{x,u}|G_n(x,u)|\int_{|y|>a_n}|y|\,dF_n^Y(y).$$

As before,

$$\sup_{x,u}|G_n(x,u)| = \varepsilon_n^{-1}\,O_p(n^{-1/2}).$$

Now, by the Markov inequality, for any $\epsilon > 0$

$$P\left\{n^{1/2}\int_{|y|>a_n}|y|\,dF_n^Y(y) > \epsilon\right\} \le \epsilon^{-1}\,E\!\left[n^{1/2}\int_{|y|>a_n}|y|\,dF_n^Y(y)\right] = \epsilon^{-1}\,n^{1/2}\int_{|y|>a_n}|y|\,dF^Y(y) \to 0$$

by assumption, and thus

$$\int_{|y|>a_n}|y|\,dF_n^Y(y) = o_p(n^{-1/2}).$$

A similar argument applies to the second integral on the RHS of (A2) and we thus have

$$(n\varepsilon_n\log n)^{1/2}\sup_u|J_1^{(1)}(u)| = (n\varepsilon_n\log n)^{1/2}\,\varepsilon_n^{-2}\,o_p(n^{-1}) \xrightarrow{P} 0$$

since $n\varepsilon_n^3/\log n \to \infty$ by assumption. □

The proof of (2.2) follows a similar pattern, and we omit the details.